BioArc: Discovering Optimal Neural Architectures for Biological Foundation Models
Fang, Yi, Xu, Haoran, Han, Jiaxin, Ding, Sirui, Wang, Yizhi, Wang, Yue, Wang, Xuan
Foundation models have revolutionized fields such as natural language processing (NLP) and computer vision (CV). While efforts have been made to transfer this success to biology, existing works directly adopt foundation model architectures from general machine learning domains without a systematic design that considers the unique physicochemical and structural properties of each biological data modality. This leads to suboptimal performance, as these repurposed architectures struggle to capture the long-range dependencies, sparse information, and complex underlying "grammars" inherent to biological data. To address this gap, we introduce BioArc, a novel framework designed to move beyond intuition-driven architecture design towards principled, automated architecture discovery for biological foundation models. Leveraging Neural Architecture Search (NAS), BioArc systematically explores a vast architecture design space, evaluating architectures across multiple biological modalities while rigorously analyzing the interplay between architecture, tokenization, and training strategies. This large-scale analysis identifies novel, high-performance architectures, allowing us to distill a set of empirical design principles to guide future model development. Furthermore, to make the best use of these discovered architectures, we propose and compare several architecture prediction methods that effectively and efficiently predict optimal architectures for new biological tasks. Overall, our work provides a foundational resource and a principled methodology to guide the creation of the next generation of task-specific and foundation models for biology.
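Since the abstract centers on searching a joint space of architecture and tokenization choices, a minimal sketch may help make the search loop concrete. Everything below (the design space, the ArchConfig fields, and the proxy_score stub) is hypothetical and stands in for BioArc's actual search space and evaluation protocol:

```python
# Hypothetical NAS-style search loop: sample architectures from a design
# space, score each on a proxy biological task, keep the best.
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class ArchConfig:
    block: str        # e.g., "attention", "conv", "ssm"
    depth: int
    width: int
    tokenizer: str    # e.g., "kmer", "bpe", "char"

DESIGN_SPACE = {
    "block": ["attention", "conv", "ssm"],
    "depth": [4, 8, 12],
    "width": [256, 512],
    "tokenizer": ["kmer", "bpe", "char"],
}

def sample_arch(rng: random.Random) -> ArchConfig:
    return ArchConfig(**{k: rng.choice(v) for k, v in DESIGN_SPACE.items()})

def proxy_score(arch: ArchConfig) -> float:
    """Placeholder metric; stands in for training/evaluating the candidate."""
    return random.Random(hash(arch)).random()

def random_search(budget: int, seed: int = 0) -> tuple[ArchConfig, float]:
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(budget):
        arch = sample_arch(rng)
        score = proxy_score(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score

if __name__ == "__main__":
    arch, score = random_search(budget=50)
    print(f"best architecture: {arch} (proxy score {score:.3f})")
```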
Reflection-Based Memory for Web Navigation Agents
Azam, Ruhana, Vempaty, Aditya, Jagmohan, Ashish
Web navigation agents have made significant progress, yet current systems operate with no memory of past experiences, leading to repeated mistakes and an inability to learn from previous interactions. We introduce Reflection-Augmented Planning (ReAP), a web navigation system that leverages both successful and failed past experiences through self-reflection. Our method improves baseline results by 11 points overall and by 29 points on previously failed tasks. These findings demonstrate that reflections can transfer across different web navigation tasks.
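As a rough illustration of the idea, a reflection memory can be as simple as storing (task, outcome, lesson) triples and retrieving the most similar ones to condition planning on a new task. The sketch below uses naive keyword-overlap retrieval; all names are hypothetical, not ReAP's actual components:

```python
# Hypothetical reflection memory: store lessons from past episodes
# (successes and failures), retrieve the most relevant for a new task,
# and prepend them to the planning prompt.
from dataclasses import dataclass, field

@dataclass
class Reflection:
    task: str
    succeeded: bool
    lesson: str

@dataclass
class ReflectionMemory:
    entries: list[Reflection] = field(default_factory=list)

    def add(self, task: str, succeeded: bool, lesson: str) -> None:
        self.entries.append(Reflection(task, succeeded, lesson))

    def retrieve(self, task: str, k: int = 3) -> list[Reflection]:
        """Rank stored reflections by word overlap with the new task."""
        words = set(task.lower().split())
        scored = sorted(
            self.entries,
            key=lambda r: len(words & set(r.task.lower().split())),
            reverse=True,
        )
        return scored[:k]

def build_planning_prompt(task: str, memory: ReflectionMemory) -> str:
    lines = [f"Task: {task}", "Relevant lessons from past episodes:"]
    for r in memory.retrieve(task):
        tag = "SUCCESS" if r.succeeded else "FAILURE"
        lines.append(f"- [{tag} on '{r.task}'] {r.lesson}")
    return "\n".join(lines)

if __name__ == "__main__":
    mem = ReflectionMemory()
    mem.add("book a flight", False, "confirm the date picker closed before submitting")
    mem.add("search for shoes", True, "use the site search bar instead of menus")
    print(build_planning_prompt("book a hotel", mem))
```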
UNLEARN: Efficient Removal of Knowledge in Large Language Models
Given the prevalence of large language models (LLMs) and the prohibitive cost of training them from scratch, dynamically forgetting specific knowledge, e.g., private or proprietary information, without retraining the model has become an important capability. This paper proposes a novel method to achieve this objective, called UNLEARN. The approach builds upon subspace methods to identify and specifically target knowledge for removal without adversely affecting other knowledge in the LLM. Results demonstrate that 96% of targeted knowledge can be forgotten while maintaining performance on other knowledge to within 2.5% of the original model, significantly outperforming the discriminatory abilities of the previous state-of-the-art. A dual method called LEARN is also proposed for targeted knowledge addition. Results show that LEARN can match the fine-tuning accuracy of Low-Rank Adaptation (LoRA) without adversely affecting similar tasks.
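A toy illustration of subspace-based forgetting, under the assumption (not confirmed by the abstract) that the target knowledge is captured by a low-rank subspace of activations: estimate that subspace by SVD, then project a weight matrix onto its orthogonal complement. This is illustrative only, not the paper's algorithm:

```python
# Toy subspace-based forgetting: find a low-rank basis for activations
# elicited by target knowledge, then make a weight matrix ignore it.
import numpy as np

def knowledge_subspace(activations: np.ndarray, rank: int) -> np.ndarray:
    """Top-`rank` left singular vectors of (d, n) activation samples."""
    u, _, _ = np.linalg.svd(activations, full_matrices=False)
    return u[:, :rank]                       # (d, rank) orthonormal basis

def forget(weight: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the component of `weight`'s input space lying in `basis`."""
    projector = np.eye(basis.shape[0]) - basis @ basis.T
    return weight @ projector                # weight now ignores that subspace

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 64, 200
    acts = rng.standard_normal((d, n))       # activations for target facts
    w = rng.standard_normal((32, d))         # some layer's weight matrix
    basis = knowledge_subspace(acts, rank=4)
    w_clean = forget(w, basis)
    # Inputs inside the forgotten subspace are now mapped (near) to zero.
    x = basis @ rng.standard_normal(4)
    print(np.linalg.norm(w_clean @ x))       # ~0
```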
On the benefits of pixel-based hierarchical policies for task generalization
Cristea-Platon, Tudor, Mazoure, Bogdan, Susskind, Josh, Talbott, Walter
Reinforcement learning practitioners often avoid hierarchical policies, especially in image-based observation spaces. Typically, the single-task performance improvement over flat-policy counterparts does not justify the additional complexity of implementing a hierarchy. However, by introducing multiple decision-making levels, hierarchical policies can compose lower-level policies to generalize more effectively between tasks, highlighting the need for multi-task evaluations. We analyze the benefits of hierarchy through simulated multi-task robotic control experiments from pixels. Our results show that hierarchical policies trained with task conditioning can (1) increase performance on training tasks, (2) improve reward and state-space generalization on similar tasks, and (3) decrease the complexity of the fine-tuning required to solve novel tasks. Thus, we believe that hierarchical policies should be considered when building reinforcement learning architectures capable of generalizing between tasks.
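To make the hierarchy concrete, here is a minimal sketch of a task-conditioned two-level policy from pixels: a high-level policy picks among K low-level skills given an image encoding and a task embedding, and the chosen skill emits the action. The linear "networks" and dimensions are placeholders, not the authors' architecture:

```python
# Hypothetical two-level policy: high level selects a skill, low level acts.
import numpy as np

rng = np.random.default_rng(0)
K, FEAT, TASK, ACT = 4, 128, 16, 6                     # skills, dims

encoder = rng.standard_normal((FEAT, 64 * 64))         # toy pixel encoder
high_level = rng.standard_normal((K, FEAT + TASK))     # skill selector
skills = rng.standard_normal((K, ACT, FEAT))           # low-level skill heads

def act(image: np.ndarray, task_embedding: np.ndarray) -> np.ndarray:
    """One hierarchical decision: select a skill, then query it for an action."""
    features = np.tanh(encoder @ image.reshape(-1))    # encode pixels
    context = np.concatenate([features, task_embedding])
    skill_id = int(np.argmax(high_level @ context))    # high-level choice
    return np.tanh(skills[skill_id] @ features)        # low-level action

if __name__ == "__main__":
    obs = rng.random((64, 64))                         # grayscale observation
    task = rng.standard_normal(TASK)                   # task-conditioning vector
    print(act(obs, task))                              # 6-dim continuous action
```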
Locating Cross-Task Sequence Continuation Circuits in Transformers
While transformer models exhibit strong capabilities on linguistic tasks, their complex architectures make them difficult to interpret. Recent work has aimed to reverse engineer transformer models into human-readable representations called circuits that implement algorithmic functions. We extend this research by analyzing and comparing circuits for similar sequence continuation tasks, which include increasing sequences of digits, number words, and months. Through the application of circuit analysis techniques, we identify key sub-circuits responsible for detecting sequence members and for predicting the next member in a sequence. Our analysis reveals that semantically related sequences rely on shared circuit subgraphs with analogous roles. Overall, documenting shared computational structures enables better prediction of model behaviors, identification of errors, and safer editing procedures. This mechanistic understanding of transformers is a critical step towards building more robust, aligned, and interpretable language models.
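A common concrete tool behind this kind of circuit analysis is activation patching: run the model on a clean and a corrupted input, copy one component's clean activation into the corrupted run, and measure how much of the output is restored. The sketch below applies the technique to a toy stand-in model, not a transformer, and is not the paper's specific pipeline:

```python
# Toy activation patching: components whose patched-in clean activation
# restores the output are candidate members of the circuit.
import numpy as np

rng = np.random.default_rng(0)
D = 8
w_in = rng.standard_normal((D, D))
w_out = rng.standard_normal(D)

def forward(x: np.ndarray, patch: tuple[int, float] | None = None) -> float:
    """Two-layer toy model; optionally overwrite hidden unit i with value v."""
    hidden = np.tanh(w_in @ x)
    if patch is not None:
        i, v = patch
        hidden = hidden.copy()
        hidden[i] = v
    return float(w_out @ hidden)

clean = rng.standard_normal(D)
corrupted = rng.standard_normal(D)
clean_hidden = np.tanh(w_in @ clean)
clean_out, corrupt_out = forward(clean), forward(corrupted)

# Patch each hidden unit in turn; recovery near 1 marks a unit that
# carries the behavior on this input pair.
for i in range(D):
    patched = forward(corrupted, patch=(i, float(clean_hidden[i])))
    recovery = (patched - corrupt_out) / (clean_out - corrupt_out)
    print(f"unit {i}: recovery {recovery:+.2f}")
```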